Fast cross-validation via sequential testing

نویسندگان

  • Tammo Krueger
  • Danny Panknin
  • Mikio L. Braun
چکیده

With the increasing size of today’s data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible, the method speeds up the computation while preserving the power of the full cross-validation. Theoretical considerations underline the statistical power of our procedure. The experimental evaluation shows that our method reduces the computation time by a factor of up to 120 compared to a full cross-validation with a negligible impact on the accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Cross-Validation via Sequential Analysis

With the increasing size of today’s data sets, finding the right parameter configuration via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses non-parametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming cand...

متن کامل

Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets

Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...

متن کامل

Software-based Testing of Sequential VHDL Descriptions

In this paper, we propose a new high-level test pattern generation technique for sequential circuits. The main motivation is two-fold: on one hand, we elaborate test data for design validation; on the other hand, we deal with the problem of structural test development at functional level. The proposed test method, i.e. mutation testing, allows us to work with a fault model at software level on ...

متن کامل

Logic Design Validation via Simulation and Automatic Test Pattern Generation

We investigate an automated design validation scheme for gate-level combinational and sequential circuits that borrows methods from simulation and test generation for physical faults, and verifies a circuit with respect to a modeled set of design errors. The error models used in prior research are examined and reduced to five types: gate substitution errors (GSEs), gate count errors (GCEs), inp...

متن کامل

Multivariate Analysis of fMRI using Fast Simultaneous Training of Generalized Linear Models (FaSTGLZ)

We present an efficient algorithm for simultaneously training elastic-net-regularized generalized linear models across many related problems, which may arise from bootstrapping, cross-validation and nonparametric permutation testing. Our approach leverages the redundancies across problems to obtain ≈ 10x computational improvements relative to solving the problems sequentially by the standard gl...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 16  شماره 

صفحات  -

تاریخ انتشار 2015